
Conversation


@NicolasHug NicolasHug commented Oct 16, 2025

Currently in our BETA CUDA Interface, we error loudly if we find that NVDEC doesn't support the input video.

In this PR, we do not error loudly anymore: we fall back to the CPU. This aligns the behavior of the BETA Interface with that of the FFmpeg interface, since FFmpeg was doing the fallback internally already (see #943).

There are some key design choices which are better explained in context, so I'll leave some comments below.

@meta-cla bot added the CLA Signed label Oct 16, 2025
@NicolasHug NicolasHug marked this pull request as ready for review October 17, 2025 11:25
return decoder->frameReadyInDisplayOrder(dispInfo);
}

static UniqueCUvideodecoder createDecoder(CUVIDEOFORMAT* videoFormat) {
Contributor Author

All checks below have been moved to the new bool nativeNVDECSupport(const SharedAVCodecContext& codecContext);

return std::nullopt;
}
}

Contributor Author

These are the same checks as before, but with a crucial difference: the source of truth that we use to get the stream properties is now FFmpeg's AVCodecContext. Previously, it was NVCUVID's CUVIDEOFORMAT.
I think that's fine. I can't imagine NVCUVID being more accurate than FFmpeg.
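
For illustration, here is a minimal sketch of what such a check can look like, with every property read from the AVCodecContext and validated against NVDEC's reported capabilities. This is hypothetical, not the PR's actual implementation: mapAVCodecIDToNVDEC is an assumed helper, SharedAVCodecContext is the PR's alias (assumed to behave like a smart pointer), and the chroma format is hard-coded to 4:2:0 for brevity.

#include <optional>
#include <cuda.h>
#include <cuviddec.h>
extern "C" {
#include <libavcodec/avcodec.h>
#include <libavutil/pixdesc.h>
}

// Hypothetical helper: map FFmpeg codec IDs to NVDEC codec enums.
static std::optional<cudaVideoCodec> mapAVCodecIDToNVDEC(AVCodecID id) {
  switch (id) {
    case AV_CODEC_ID_H264: return cudaVideoCodec_H264;
    case AV_CODEC_ID_HEVC: return cudaVideoCodec_HEVC;
    case AV_CODEC_ID_AV1:  return cudaVideoCodec_AV1;
    default:               return std::nullopt;
  }
}

bool nativeNVDECSupport(const SharedAVCodecContext& codecContext) {
  auto codec = mapAVCodecIDToNVDEC(codecContext->codec_id);
  if (!codec.has_value()) {
    return false;
  }
  const AVPixFmtDescriptor* desc = av_pix_fmt_desc_get(codecContext->pix_fmt);
  if (desc == nullptr) {
    return false;
  }
  // Ask NVDEC whether this codec / bit-depth combination is supported on the
  // current device. Assumes a CUDA context is current, and 4:2:0 chroma.
  CUVIDDECODECAPS caps = {};
  caps.eCodecType = *codec;
  caps.eChromaFormat = cudaVideoChromaFormat_420;
  caps.nBitDepthMinus8 = desc->comp[0].depth - 8;
  if (cuvidGetDecoderCaps(&caps) != CUDA_SUCCESS || !caps.bIsSupported) {
    return false;
  }
  // Also validate the coded dimensions against the decoder's limits.
  return codecContext->coded_width >= caps.nMinWidth &&
      codecContext->coded_height >= caps.nMinHeight &&
      static_cast<unsigned>(codecContext->coded_width) <= caps.nMaxWidth &&
      static_cast<unsigned>(codecContext->coded_height) <= caps.nMaxHeight;
}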

}
return;
}

Contributor Author

Above is essentially the same as the FFmpeg interface logic:

if (avFrame->format != AV_PIX_FMT_CUDA) {
  // The frame's format is AV_PIX_FMT_CUDA if and only if its content is on
  // the GPU. In this branch, the frame is on the CPU. There are two possible
  // reasons:
  //
  // 1. During maybeConvertAVFrameToNV12OrRGB24(), we had a non-NV12 format
  //    frame and we're on FFmpeg 4.4 or earlier. In such cases, we had to
  //    use CPU filters and we just converted the frame to RGB24.
  // 2. This is what NVDEC gave us if it wasn't able to decode a frame, for
  //    whatever reason. Typically that happens if the video's encoder isn't
  //    supported by NVDEC.
  //
  // In both cases, we have a frame on the CPU. We send the frame back to the
  // CUDA device when we're done.
  enum AVPixelFormat frameFormat =
      static_cast<enum AVPixelFormat>(avFrame->format);
  FrameOutput cpuFrameOutput;
  if (frameFormat == AV_PIX_FMT_RGB24) {
    // Reason 1 above. The frame is already in RGB24, we just need to convert
    // it to a tensor.
    cpuFrameOutput.data = rgbAVFrameToTensor(avFrame);
  } else {
    // Reason 2 above. We need to do a full conversion which requires an
    // actual CPU device.
    cpuInterface_->convertAVFrameToFrameOutput(avFrame, cpuFrameOutput);
  }
  // Finally, we need to send the frame back to the GPU. Note that the
  // pre-allocated tensor is on the GPU, so we can't send that to the CPU
  // device interface. We copy it over here.
  if (preAllocatedOutputTensor.has_value()) {
    preAllocatedOutputTensor.value().copy_(cpuFrameOutput.data);
    frameOutput.data = preAllocatedOutputTensor.value();
  } else {
    frameOutput.data = cpuFrameOutput.data.to(device_);
  }
  return;
}

There are still minor differences that we should eventually align as part of #943.

# https://github.com/meta-pytorch/torchcodec/pull/977
# We need to define a better way to assert which backend a decoder
# is using.
return
Contributor Author

It's OK to deactivate this test for now; we know it's working fine.

I think we should have a private way to query the VideoDecoder object so that it can tell us which interface / backend it's currently using. I will follow up on this.
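
As a rough idea of what that could look like on the C++ side (all names here are hypothetical, not from this PR):

// Hypothetical: an enum describing which backend actually decodes frames.
enum class DecoderBackend { CpuFFmpeg, CudaFFmpeg, CudaBeta, CpuFallback };

class SingleStreamDecoder {
 public:
  // For tests only: report which backend is currently in use, so tests can
  // assert "NVDEC was used" or "the CPU fallback kicked in" directly.
  DecoderBackend getActiveBackend() const {
    return activeBackend_;
  }

 private:
  DecoderBackend activeBackend_ = DecoderBackend::CpuFFmpeg;
};

A private Python binding could then surface this to the tests.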

dec.get_frame_at(0)
beta = VideoDecoder(H265_VIDEO.path, device="cuda").get_frame_at(0)

torch.testing.assert_close(ffmpeg.data, beta.data, rtol=0, atol=0)
Contributor Author

Above: we used to assert that an error was raised; now we just assert that the video can be decoded properly. Strictly speaking, we're not asserting that there is a fallback. (That will probably come in follow-ups.)

There's another important thing that we're not actually enforcing in the tests: we need to make sure that we don't have false positives, i.e. that we're not falling back to the CPU when we don't need to. To check that, I added a TORCH_CHECK(false) in the CPU fallback path (sketched below). All the tests pass, except for these two:

FAILED test/test_decoders.py::TestVideoDecoder::test_get_key_frame_indices[cuda:beta] - RuntimeError: Falling back to CPU decoder.
FAILED test/test_decoders.py::TestVideoDecoder::test_beta_cuda_interface_cpu_fallback - RuntimeError: Falling back to CPU decoder.

This is the expected result: these two tests are using H265_VIDEO, which is the only video that needs a fallback.
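
The probe itself was just a temporary check at the top of the fallback path, along the lines of:

// Temporary probe, not committed: any test that still passes with this in
// place is guaranteed to have decoded natively on NVDEC.
TORCH_CHECK(false, "Falling back to CPU decoder.");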

I will try to follow up with an official and robust way to assert the backend / fallback of the VideoDecoder object (see the other comment below).

// We'll always use the CPU fallback from now on, so we can return early.
return;
}

Contributor Author
@NicolasHug NicolasHug Oct 17, 2025

Above: we create the fallback CPU interface only if we need it.

I'm a bit sad that the BETA Interface needs to store a CPU interface. I have TODOs in other places basically saying "An interface shouldn't rely on another interface".

But I'm not sure we have a better way to implement this fallback mechanism, so I think our principle should be something less strict, along these lines: "It's OK for interface A to use interface B, but interface B shouldn't need to worry about being called by interface A". I think we're respecting this principle so far.

The main design alternative would be for the fallback to be implemented within SingleStreamDecoder rather than in the interface itself, but I'm not sure we want this.
Specifically for this fallback, we'd have to create a third hybrid interface (the fallback one) that runs all the decoding on the CPU but runs the color conversion (convertAVFrameToFrameOutput) on the GPU. Sounds like a hassle.
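
The lazy-creation pattern is roughly the following (member and factory names are illustrative and may not match the PR exactly):

// Inside the BETA CUDA interface: the CPU interface member stays null until
// we learn that the stream isn't natively supported by NVDEC.
std::unique_ptr<DeviceInterface> cpuInterface_;

void initializeCpuFallback() {
  if (!cpuInterface_) {
    // Only pay the cost of a CPU interface when we actually need it.
    cpuInterface_ = createDeviceInterface(torch::Device(torch::kCPU));
  }
}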

Contributor

"It's OK for interface A to use interface B, but interface B shouldn't need to worry about being called by interface A". I think we're respecting this principle so far.

Agreed and agreed.

In my previous refactoring of the FFmpeg CUDA device interface, I believe I had always created the CPU device for fallback. But in that device, I also think we only knew we would need the fallback when we discovered that FFmpeg had decoded the frame on the CPU and we then needed to do CPU color conversion. If we could discover that earlier in the FFmpeg CUDA device interface, then doing this pattern would be good. But I doubt we can.

} else {
  frameOutput.data = cpuFrameOutput.data.to(device_);
}
return;
Contributor

We do this dance in at least one other place in the FFmpeg CUDA device. Maybe part of #943 could also be pulling this into a utils function.
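
Such a util could look roughly like this (hypothetical name and signature):

#include <optional>
#include <torch/torch.h>

// Hypothetical helper: move a CPU frame to the target device, reusing the
// pre-allocated GPU tensor when the caller provided one.
torch::Tensor toTargetDevice(
    const torch::Tensor& cpuData,
    const torch::Device& device,
    std::optional<torch::Tensor> preAllocatedOutputTensor) {
  if (preAllocatedOutputTensor.has_value()) {
    // copy_() fills the existing GPU tensor in place, avoiding a fresh
    // allocation on the device.
    preAllocatedOutputTensor.value().copy_(cpuData);
    return preAllocatedOutputTensor.value();
  }
  return cpuData.to(device);
}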

@NicolasHug NicolasHug merged commit 1befed7 into meta-pytorch:main Oct 20, 2025
58 checks passed
@NicolasHug NicolasHug deleted the fallback-better branch October 20, 2025 16:25
NicolasHug added a commit to NicolasHug/torchcodec that referenced this pull request Oct 27, 2025